Abstract

We consider a multivariate distributional recursion of sum type as
arising in the probabilistic analysis of algorithms and random trees. We
prove an upper tail bound for the solution using Chernoff's bounding
technique by estimating the Laplace transform. The problem is traced
back to the corresponding problem for binary search trees by stochastic
domination. The result obtained is applied to the internal path length
and Wiener index of random $b$-ary recursive trees with weighted edges
and random linear recursive trees. Finally, lower tail bounds for the
Wiener index of these trees are given.

Key words: random trees, probabilistic analysis of algorithms, tail bounds, 
path length, Wiener index 



1 Introduction 

Many parameters of recursive algorithms, trees or other recursive structures
can be described by a so-called recursion of sum type

\[ X_n \stackrel{d}{=} \sum_{i=1}^{b} A_i(I_n)\, X^{(i)}_{I_{n,i}} + d(I_n, Z) \qquad (n \ge 2) \tag{1} \]

where $X^{(1)}_n, \dots, X^{(b)}_n$ have the same distribution as $X_n$, $d : \mathbb{R}^b \times \mathbb{R}^b \to \mathbb{R}^k$
and $A_i : \mathbb{R}^b \to \mathbb{R}^{k \times k}$ are deterministic functions, $I_n = (I_{n,1}, \dots, I_{n,b}) \in \{0, \dots, n-1\}^b$
and $Z \in \mathbb{R}^b$ are random vectors with $E[d(I_n, Z)] = 0$,
and $X^{(1)}_n, \dots, X^{(b)}_n, I_n, Z$ are independent. By $\stackrel{d}{=}$ we denote equality in
distribution.
From the algorithmic point of view, such a recurrence arises by considering
so-called divide and conquer algorithms. Let $Y_n$ denote the parameter of
interest of the algorithm applied to a problem of size $n$. The algorithm splits
the large problem into $b$ subproblems of the smaller sizes $I_{n,1}, \dots, I_{n,b}$. If the
considered parameter $Y_n$ is essentially given by the (possibly weighted) sum
of the corresponding parameters of the smaller subproblems, then for a matrix
$C_n$ the vector $X_n := C_n (Y_n - E[Y_n])$ satisfies the recurrence (1), where the
coefficients $A_i(I_n)$ are the weights of the subproblems (scaled by $C_n$ and
$C_{I_{n,i}}$) and the additional function $d$ gives the cost for splitting the problem
in this manner and merging the solutions of the subproblems to a solution
of the size $n$ problem. The vector $Z$ allows for additional generality.
One famous example for a parameter satisfying recursion (1) is the
number of comparisons made by quicksort, which is equal
in distribution to the internal path length of the random binary search
tree. McDiarmid and Hayward (1996) used martingale difference methods
to show upper tail bounds for it. Rösler (1992) as well as Fill and Janson
(2001) obtained upper bounds for its Laplace transform by induction. Having
upper bounds for the Laplace transform, they concluded upper bounds
for the tails of the distribution by application of Chernoff's bounding
technique. Ali Khan and Neininger (2007) generalized this procedure to the
two-dimensional recursion for the Wiener index and the internal path length of
the random binary search tree, extending their technique in Ali Khan and Neininger
(2004) for the analysis of tail bounds for the complexity of a randomized
algorithm to evaluate game trees.

In this paper, we apply the method of Ali Khan and Neininger (2007) to
multivariate functionals satisfying recursion (1) where the operator norm of
the coefficient matrices $A_i$ can be stochastically bounded in a special way.
We denote by $\preceq_{st}$ the stochastic order and by $U$ a random variable uniformly
distributed on $[0, 1]$. The fundamental result of this paper is the following
theorem.



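Theorem 1.1. Let $X_n$ be a solution of the distributional recursion (1).
Assume that $X_1 = 0$, $\|d(I_n, Z)\| \le D$ almost surely for all $n \in \mathbb{N}$ and for a
constant $D \in \mathbb{R}$ and that

\[ \sum_{i=1}^{b} \|A_i(I_n)\|_{op}^2 \preceq_{st} 1 - U(1-U) \]

as well as $\|A_i\|_{op} \le 1$ for all $i \in \{1, \dots, b\}$. Let $\gamma \approx 2.0047$ be the positive
solution of $12/7 = e^{2/\gamma} - 2/\gamma$ and $L_0 \approx 5.0177$ be the largest root of $e^L = 6L^2$.
Then we have for all $t > 0$, $n \in \mathbb{N}$ and any component $X_{n,j}$ of $X_n$
($j \in \{1, \dots, k\}$) with $C := 48D/\gamma + D\sqrt{48(48/\gamma^2 - 5)}$,

\[ P(X_{n,j} \ge t) \le \begin{cases}
\exp\Bigl(-\dfrac{t^2}{10\gamma^2 D^2}\Bigr), & \text{if } 0 \le t \le 5\gamma D, \\[1ex]
\exp\Bigl(\dfrac{5}{2} - \dfrac{t}{\gamma D}\Bigr), & \text{if } 5\gamma D < t \le C, \\[1ex]
\exp\Bigl(-\dfrac{t^2}{96 D^2}\Bigr), & \text{if } C < t \le 48 D L_0, \\[1ex]
\exp\Bigl(24 L_0^2 - \dfrac{L_0}{D}\, t\Bigr), & \text{if } 48 D L_0 < t \le 4 D e^{L_0}, \\[1ex]
\exp\Bigl(\dfrac{t}{D}\Bigl(1 - \log\dfrac{t}{4D}\Bigr)\Bigr), & \text{if } 4 D e^{L_0} < t.
\end{cases} \]

The same bounds hold for the left tail $P(X_{n,j} \le -t)$.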
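The five regimes of Theorem 1.1 can be evaluated numerically. The following is a minimal sketch, assuming the case distinction takes the form obtained from the Chernoff optimization in the proof in Section 2 (exponents $-t^2/(10\gamma^2D^2)$, $5/2 - t/(\gamma D)$, $-t^2/(96D^2)$, $24L_0^2 - L_0 t/D$ and $(t/D)(1-\log(t/(4D)))$). The function names `bisect` and `tail_exponent` are illustrative and not part of the paper.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Plain bisection for a root of g on [lo, hi]; assumes a sign change."""
    glo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        gmid = g(mid)
        if (glo < 0) == (gmid < 0):
            lo, glo = mid, gmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# gamma: positive solution of 12/7 = e^{2/gamma} - 2/gamma
GAMMA = bisect(lambda g: math.exp(2 / g) - 2 / g - 12 / 7, 1.0, 3.0)
# L0: largest root of e^L = 6 L^2 (bracketed by [4, 6])
L0 = bisect(lambda L: math.exp(L) - 6 * L * L, 4.0, 6.0)

def tail_exponent(t, D):
    """Exponent e(t) such that P(X_{n,j} >= t) <= exp(e(t)), for t > 0.

    Assumption: the five cases are those read off from the proof of
    Theorem 1.1; D is the almost sure bound on the toll term.
    """
    C = 48 * D / GAMMA + D * math.sqrt(48 * (48 / GAMMA**2 - 5))
    if t <= 5 * GAMMA * D:
        return -t * t / (10 * GAMMA**2 * D**2)
    if t <= C:
        return 2.5 - t / (GAMMA * D)
    if t <= 48 * D * L0:
        return -t * t / (96 * D**2)
    if t <= 4 * D * math.exp(L0):
        return 24 * L0**2 - L0 * t / D
    return (t / D) * (1 - math.log(t / (4 * D)))
```

Under these assumptions the exponent is continuous at the four breakpoints, which is a useful sanity check on the constants: for instance, at $t = 5\gamma D$ both neighbouring expressions equal $-5/2$.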



As an application of Theorem 1.1 we obtain upper tail bounds for the
distribution of the internal path length and the Wiener index in random $b$-ary
recursive trees with weighted edges by showing the stochastic domination
condition. The distance between two nodes in a tree is defined as the number
of edges on the unique path between the two nodes. Then, the internal path
length of a rooted tree is the sum of all node depths of the tree where the
depth of a node is its distance to the root, and the Wiener index is the sum
of the distances between all unordered pairs of nodes.

The $b$-ary recursive tree with weighted edges can be considered as a special
case of the tree model in the paper of Broutin and Devroye (2006) in
discrete time where the lifetimes of the edges are independent exponentially
distributed random variables. The shape of the random tree is also obtained
as an increasing tree due to Bergeron et al. (1992) and is a special case of
the general model of random trees in Broutin et al. (2008).

Theorem 1.2. Let $Y_n := (W_n, P_n)^T$ denote the vector consisting of the
Wiener index and the internal path length of a random $b$-ary recursive tree
of size $n$ with edge weights $Z$ where $\|Z\|$ is bounded almost surely. Then
there exists a constant $D$ such that we have in the recursive formula (1) for
$X_n$ given by

\[ X_n := \begin{pmatrix} 1/n^2 & 0 \\ 0 & 1/n \end{pmatrix} (Y_n - E[Y_n]) \]

$\|d(I_n, Z)\| \le D$ almost surely, and the bounds of Theorem 1.1 are valid for

\[ P\Bigl(\frac{W_n - E[W_n]}{n^2} \ge t\Bigr) \qquad \text{and} \qquad P\Bigl(\frac{P_n - E[P_n]}{n} \ge t\Bigr) \]

as well as for the corresponding left tails $P(X_{n,j} \le -t)$ (for $j = 1, 2$).

Using the asymptotic expansion of the expectation of the internal path
length and the Wiener index, the following asymptotic tail bounds are obtained.

Corollary 1.3. Let $P_n$ denote the internal path length and $W_n$ be the
Wiener index of a random $b$-ary recursive tree of size $n$ with edge weights
$Z$ where $\|Z\|$ is bounded almost surely. Then, there exists a constant $D > 0$
such that for $t > 0$ and $n \to \infty$ it holds

\[ P(|P_n - E[P_n]| > t E[P_n]) \le \exp\Bigl(-\frac{b\mu}{(b-1)D}\, t \log n \,\bigl(\log^{(2)} n + \log t + a + o(1)\bigr)\Bigr) \]

and

\[ P(|W_n - E[W_n]| > t E[W_n]) \le \exp\Bigl(-\frac{b\mu}{(b-1)D}\, t \log n \,\bigl(\log^{(2)} n + \log t + a + o(1)\bigr)\Bigr) \]

where $\mu = E[Z_1]$ and $a := \log\bigl(b\mu/(4D(b-1)e)\bigr)$.

Finally, by special choices of the edge weights and the use of transfer results
in Munsonius (2010b), the corresponding bounds for random linear recursive
trees are obtained. The model of linear recursive trees was introduced by Pittel
(1994). Starting with the root, the linear recursive tree grows node by node.
In each step the new node is attached to a randomly chosen node of the
previous ones. The probability that node $u$ is chosen is proportional to
the weight $w_u = 1 + \beta \deg(u)$ where $\deg(u)$ is the number of children of $u$
and $\beta \in \mathbb{R}_{\ge 0}$ is the parameter of the tree. This tree model encompasses
as special cases the random recursive tree ($\beta = 0$) and the plane oriented
recursive tree ($\beta = 1$).

Corollary 1.4. Let $P_n$ denote the internal path length of a random linear
recursive tree of size $n$ with weight function $u \mapsto 1 + (b-2)\deg(u)$ for $b \in \mathbb{N}$
and $b \ge 2$. Then there exists $D > 0$ such that we have
for $(P_n - E[P_n])/n$ the same tail bounds as in Theorem 1.2, and in particular
we have for $t > 0$ and $n \to \infty$

\[ P(|P_n - E[P_n]| > t E[P_n]) \le \exp\Bigl(-\frac{1}{(b-1)D}\, t \log n \,\bigl(\log^{(2)} n + \log t + a + o(1)\bigr)\Bigr) \]

with $a := -\log(4D(b-1)e)$.

Corollary 1.5. Let $W_n$ denote the Wiener index of a random linear recursive
tree of size $n$ with weight function $u \mapsto 1 + (b-2)\deg(u)$ for $b \in \mathbb{N}$ and
$b \ge 2$. Then there exists $D > 0$ such that we have for $t > 0$ and $n \to \infty$

\[ P(|W_n - E[W_n]| > t E[W_n]) \le \exp\Bigl(-\frac{1}{(b-1)D}\, t \log n \,\bigl(\log^{(2)} n + \log t + a + o(1)\bigr)\Bigr) \]

with $a := -\log(4D(b-1)e)$.



Using the WKB method, Knessl and Szpankowski (1999) argue for very
sharp bounds for the tail of the limit distribution of the internal path
length of random binary search trees. In Rüschendorf and Schopp (2007)
general upper bounds for tails of distributions given by a recursion of sum
type are shown in the one-dimensional case. For simply generated trees,
asymptotics for the right tail of the limit distribution of the total path
length and the Wiener index are shown in Chassaing and Janson (2004)
and Fill and Janson (2009).

This paper is organized as follows. In Section 2 we consider the general
recursion formula (1) and give a proof for the upper tail bound in Theorem
1.1. The $b$-ary recursive tree with weighted edges is defined in Section 3. We
then show the stochastic domination condition in this case by a coupling
argument and conclude Theorem 1.2 and Corollary 1.3 in Section 3.1. Finally,
we conclude by transfer results from Munsonius (2010b) the upper tail
bounds in the case of random linear recursive trees (Corollary 1.4 and Corollary
1.5) in Section 3.2. At the end, we give a summary of corresponding results
concerning lower tail bounds for the Wiener index in Section 4.

We denote by $\|\cdot\|$ the Euclidean norm in $\mathbb{R}^k$ and by $\|\cdot\|_{op}$ the operator
norm for matrices. Equality in distribution is written as $\stackrel{d}{=}$. For functions $f$
and $g$ we write $f = o(g)$, $f = O(g)$ and $f = \Theta(g)$ if $\lim_{n\to\infty} f(n)/g(n) = 0$,
$|f(n)/g(n)| \le C$ and $c \le |f(n)/g(n)| \le C$ for all $n$ with some constants
$0 < c \le C < \infty$ respectively.
2 Upper tail bound for a general recursion

We consider a random $k$-dimensional vector $X_n = (X_{n,1}, \dots, X_{n,k})$ which
solves the distributional recursion formula

\[ X_n \stackrel{d}{=} \sum_{i=1}^{b} A_i(I_n)\, X^{(i)}_{I_{n,i}} + d(I_n, Z) \]

where $X^{(1)}_n, \dots, X^{(b)}_n$ have the same distribution as $X_n$, $d : \mathbb{R}^b \times \mathbb{R}^b \to \mathbb{R}^k$
and $A_i : \mathbb{R}^b \to \mathbb{R}^{k \times k}$ are deterministic functions, $Z \in \mathbb{R}^b$ and
$I_n = (I_{n,1}, \dots, I_{n,b}) \in \{0, \dots, n-1\}^b$ are random vectors with $E[d(I_n, Z)] = 0$,
and $X^{(1)}_n, \dots, X^{(b)}_n, I_n, Z$ are independent.
We denote by $\preceq_{st}$ the stochastic order and by $U$ a random variable uniformly
distributed on $[0, 1]$.

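Lemma 2.1. Let $X_n$ be a solution of the distributional recursion (1). Assume
that $X_1 = 0$, $\|d(I_n, Z)\| \le D$ almost surely for all $n \in \mathbb{N}$ and for a
constant $D \in \mathbb{R}$ and that

\[ \sum_{i=1}^{b} \|A_i(I_n)\|_{op}^2 \preceq_{st} 1 - U(1-U) \tag{3} \]

as well as $\|A_i\|_{op} \le 1$. Let $\gamma \approx 2.0047$ be the positive solution of

\[ \frac{12}{7} = e^{2/\gamma} - \frac{2}{\gamma} \qquad \text{and} \qquad K = \frac{5}{2} D^2 \gamma^2. \]

Then we have for all $s \in \mathbb{R}^k$ with $\|s\| \le 1/(\gamma D)$ and for all $n \in \mathbb{N}$

\[ E[\exp(\langle s, X_n \rangle)] \le \exp(K \|s\|^2). \]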

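Proof. We show the claim by induction on $n$. For $n = 1$ we have $X_1 = 0$
and there is nothing to show.

Using the recursion formula and the given independence we get for $n \ge 2$

\[ E[\exp(\langle s, X_n \rangle)]
= \sum_{x \in \{0, \dots, n-1\}^b} E\Bigl[\exp\Bigl(\Bigl\langle s, \sum_{i=1}^{b} A_i(I_n) X^{(i)}_{I_{n,i}} + d(I_n, Z) \Bigr\rangle\Bigr) \,\Big|\, I_n = x \Bigr] P(I_n = x) \]
\[ = \sum_{x \in \{0, \dots, n-1\}^b} E[\exp(\langle s, d(x, Z) \rangle)] \prod_{i=1}^{b} E\bigl[\exp\bigl(\langle A_i(x)^T s, X^{(i)}_{x_i} \rangle\bigr)\bigr] P(I_n = x). \]

The assumption $\|A_i(x)\|_{op} \le 1$ implies $\|A_i(x)^T s\| \le \|s\| \|A_i(x)\|_{op} \le \|s\|$.
Since for every $i \in \{1, \dots, b\}$ we have $x_i \le n - 1$ we can apply the induction
hypothesis. Therefore, we obtain

\[ E[\exp(\langle s, X_n \rangle)]
\le \sum_{x \in \{0, \dots, n-1\}^b} E[\exp(\langle s, d(x, Z) \rangle)] \exp\Bigl(K \|s\|^2 \sum_{i=1}^{b} \|A_i(x)\|_{op}^2\Bigr) P(I_n = x) \]
\[ = E\Bigl[\exp(\langle s, d(I_n, Z) \rangle) \exp\Bigl(K \|s\|^2 \sum_{i=1}^{b} \|A_i(I_n)\|_{op}^2\Bigr)\Bigr]. \tag{4} \]

By condition (3) and monotonicity of $x \mapsto e^{\lambda x}$ we conclude

\[ E[\exp(\langle s, X_n \rangle)] \le E\bigl[\exp(\langle s, d(I_n, Z) \rangle) \exp\bigl(K \|s\|^2 (1 - U(1-U))\bigr)\bigr] \]
\[ = e^{K \|s\|^2} E\bigl[\exp(\langle s, d(I_n, Z) \rangle) \exp\bigl(-K \|s\|^2 U(1-U)\bigr)\bigr]. \tag{5} \]

Hence, using the Cauchy-Schwarz inequality it suffices to show that

\[ \bigl(E\bigl[\exp(\langle s, d(I_n, Z) \rangle) \exp\bigl(-K \|s\|^2 U(1-U)\bigr)\bigr]\bigr)^2
\le E[\exp(2 \langle s, d(I_n, Z) \rangle)]\, E\bigl[\exp\bigl(-2K \|s\|^2 U(1-U)\bigr)\bigr] \le 1. \tag{6} \]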

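By assumption $\|d(I_n, Z)\| \le D$ holds almost surely and $E[d(I_n, Z)] = 0$.
Thus, we get for $\|s\| \le 1/(\gamma D)$

\[ E[\exp(2 \langle s, d(I_n, Z) \rangle)] = 1 + E\Bigl[\sum_{k=2}^{\infty} \frac{2^k}{k!} \langle s, d(I_n, Z) \rangle^k\Bigr]
= 1 + E\Bigl[\langle s, d(I_n, Z) \rangle^2 \sum_{k=2}^{\infty} \frac{2^k}{k!} \langle s, d(I_n, Z) \rangle^{k-2}\Bigr] \]
\[ \le 1 + \|s\|^2 D^2 \sum_{k=2}^{\infty} \frac{2^k}{k!} \gamma^{-(k-2)}
= 1 + \|s\|^2 D^2 \gamma^2 \Bigl(e^{2/\gamma} - 1 - \frac{2}{\gamma}\Bigr). \tag{7} \]

For all $x \ge 0$ we have

\[ e^{-x} \le 1 - x + \frac{x^2}{2}. \]

This yields for the second factor in (6)

\[ E\bigl[\exp\bigl(-2K \|s\|^2 U(1-U)\bigr)\bigr] \le E\bigl[1 - 2K \|s\|^2 U(1-U) + 2K^2 \|s\|^4 U^2 (1-U)^2\bigr]
= 1 - \frac{1}{3} K \|s\|^2 + \frac{1}{15} K^2 \|s\|^4. \tag{8} \]

With (7) and (8) we see that (6) will follow from

\[ \Bigl(1 + \|s\|^2 D^2 \gamma^2 \Bigl(e^{2/\gamma} - 1 - \frac{2}{\gamma}\Bigr)\Bigr) \Bigl(1 - \frac{1}{3} K \|s\|^2 + \frac{1}{15} K^2 \|s\|^4\Bigr) \le 1. \]

This in turn is equivalent to $f(\|s\|) \le 0$ for

\[ f(\|s\|) := D^2 \gamma^2 \Bigl(e^{2/\gamma} - 1 - \frac{2}{\gamma}\Bigr) - \frac{K}{3}
- \Bigl(\frac{K D^2 \gamma^2}{3} \Bigl(e^{2/\gamma} - 1 - \frac{2}{\gamma}\Bigr) - \frac{K^2}{15}\Bigr) \|s\|^2
+ \frac{K^2 D^2 \gamma^2}{15} \Bigl(e^{2/\gamma} - 1 - \frac{2}{\gamma}\Bigr) \|s\|^4. \tag{9} \]

We substitute $K = \frac{5}{2} D^2 \gamma^2$ and $e^{2/\gamma} - 1 - 2/\gamma = 5/7$ and obtain

\[ f(\|s\|) = D^2 \gamma^2 \Bigl(\frac{5}{7} - \frac{5}{6} + \Bigl(\frac{5}{12} - \frac{25}{42}\Bigr) (D \gamma \|s\|)^2 + \frac{25}{84} (D \gamma \|s\|)^4\Bigr). \]

We see that $f(0) < 0$ and $f(1/(\gamma D)) = 0$. Since $f$ is a biquadratic function
in $\|s\|$ with a positive coefficient corresponding to $\|s\|^4$ and $f(0) < 0$, it
has at most two real roots. On the interval between these two roots the
function is negative and outside this interval the function takes only positive
values. Since $f(1/(\gamma D)) = 0$ we therefore get $f(\|s\|) \le 0$ for all $s$ with
$0 \le \|s\| \le 1/(\gamma D)$. □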
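The final substitution in the proof of Lemma 2.1 can be checked with exact rational arithmetic. The sketch below assumes the normalisation $y = D\gamma\|s\|$, $K = \frac{5}{2}D^2\gamma^2$ and $e^{2/\gamma} - 1 - 2/\gamma = 5/7$, so that $f/(D^2\gamma^2)$ is a polynomial $a_0 + a_2 y^2 + a_4 y^4$ with rational coefficients. The function name `f_coefficients` is illustrative only.

```python
from fractions import Fraction as F

def f_coefficients():
    """Coefficients (a0, a2, a4) of f / (D^2 gamma^2) in y = D*gamma*||s||.

    Assumptions: c = e^{2/gamma} - 1 - 2/gamma = 5/7 (choice of gamma) and
    K = (5/2) D^2 gamma^2, so K is represented here by the factor 5/2.
    """
    c = F(5, 7)
    K = F(5, 2)
    a0 = c - K / 3                   # constant term
    a2 = K * K / 15 - K * c / 3      # coefficient of y^2
    a4 = K * K * c / 15              # coefficient of y^4
    return a0, a2, a4

a0, a2, a4 = f_coefficients()
# f(0) < 0 corresponds to a0 < 0; f(1/(gamma D)) = 0 to a0 + a2 + a4 == 0.
value_at_boundary = a0 + a2 + a4
```

With these inputs the coefficients come out as $-5/42$, $-5/28$ and $25/84$, whose sum is zero, in line with $f(0) < 0$ and $f(1/(\gamma D)) = 0$.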

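Lemma 2.2. Let $X_n$ be a solution of the distributional recursion (1). Assume
that $X_1 = 0$, $\|d(I_n, Z)\| \le D$ almost surely for all $n \in \mathbb{N}$ and for a
constant $D \in \mathbb{R}$ and that

\[ \sum_{i=1}^{b} \|A_i(I_n)\|_{op}^2 \preceq_{st} 1 - U(1-U) \]

as well as $\|A_i\|_{op} \le 1$. Let $\gamma \approx 2.0047$ be the positive solution of
$12/7 = e^{2/\gamma} - 2/\gamma$ and $L_0 \approx 5.0177$ be the largest root of $e^L = 6L^2$.
Then we have for $1/(\gamma D) \le \|s\| \le L$

\[ E[\exp(\langle s, X_n \rangle)] \le \exp(K_L \|s\|^2) \]

where

\[ K_L := \begin{cases} 24 D^2, & \text{for } 1/(\gamma D) \le L \le L_0/D, \\ \dfrac{4}{L^2} e^{LD}, & \text{for } L_0/D < L. \end{cases} \]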

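Proof. We again use induction on $n$. For $n = 1$ there is nothing to show.
We use the same arguments as in the beginning of the proof of Lemma 2.1
and get (5):

\[ E[\exp(\langle s, X_n \rangle)] \le e^{K_L \|s\|^2} E\bigl[\exp(\langle s, d(I_n, Z) \rangle) \exp\bigl(-K_L \|s\|^2 U(1-U)\bigr)\bigr] \]

for a random variable $U$ which is uniformly distributed on $[0, 1]$. Hence,
it suffices to bound the expectation on the right-hand side by 1 under the new
assumptions. Since $\|d(I_n, Z)\| \le D$ almost surely, the proof is completed by showing

\[ e^{D \|s\|} E\bigl[\exp\bigl(-K_L \|s\|^2 U(1-U)\bigr)\bigr] \le 1. \]

Fill and Janson (2001, Section 4) proved that for any $K \ge 0$

\[ E\bigl[\exp\bigl(-2K \|s\|^2 U(1-U)\bigr)\bigr] \le \frac{1 - \exp(-K \|s\|^2 / 2)}{K \|s\|^2 / 2} \tag{10} \]

and for $0.42 \le |\lambda| \le M$

\[ e^{|\lambda|}\, \frac{1 - \exp(-K_M \lambda^2 / 2)}{K_M \lambda^2 / 2} \le 1 \qquad \text{when} \qquad
K_M = \begin{cases} 12, & \text{for } M \le L_0, \\ 2 e^{M} / M^2, & \text{for } L_0 < M. \end{cases} \tag{11} \]

In the present situation, it follows

\[ e^{D \|s\|} E\bigl[\exp\bigl(-K_L \|s\|^2 U(1-U)\bigr)\bigr]
\le e^{D \|s\|}\, \frac{1 - \exp\Bigl(-\dfrac{K_L}{2D^2} \cdot \dfrac{D^2 \|s\|^2}{2}\Bigr)}{\dfrac{K_L}{2D^2} \cdot \dfrac{D^2 \|s\|^2}{2}} \le 1 \]

when $1/\gamma \le D \|s\| \le LD$ and

\[ K_L = \begin{cases} 24 D^2, & \text{for } L \le L_0/D, \\ \dfrac{4}{L^2} e^{LD}, & \text{for } L_0/D < L. \end{cases} \]

Thus, we obtain the claim because $1/\gamma \ge 0.42$.

□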
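The inequality borrowed from Fill and Janson can be probed numerically. The sketch below assumes that (11) has the form $e^{\lambda}\,(1 - e^{-K_M\lambda^2/2})/(K_M\lambda^2/2) \le 1$ with $K_M = 12$ for $M \le L_0$ and $K_M = 2e^M/M^2$ otherwise; the helper names `K_M`, `lhs` and `max_lhs` are hypothetical.

```python
import math

L0 = 5.0177  # approximate largest root of e^L = 6 L^2, as quoted in the text

def K_M(M):
    """Constant K_M from (11), under the reading assumed above."""
    return 12.0 if M <= L0 else 2.0 * math.exp(M) / M**2

def lhs(lam, M):
    """Left-hand side of (11) for 0 < lam <= M."""
    a = K_M(M) * lam * lam / 2.0
    return math.exp(lam) * (1.0 - math.exp(-a)) / a

def max_lhs(M, lo=0.42, steps=2000):
    """Maximum of the left-hand side over a grid on [lo, M]."""
    return max(lhs(lo + (M - lo) * k / steps, M) for k in range(steps + 1))
```

On a grid the maximum stays below 1 for $M = 3$ and does not exceed 1 (up to rounding) for $M = 8$, while a value such as $\lambda = 0.3$ already violates the inequality with $K_M = 12$, which illustrates why a lower threshold on $|\lambda|$ is needed.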



We summarize the results of the two preceding lemmas.

Corollary 2.3. Let $X_n$ be a solution of the distributional recursion (1).
Assume that $X_1 = 0$, $\|d(I_n, Z)\| \le D$ almost surely for all $n \in \mathbb{N}$ and for a
constant $D \in \mathbb{R}$ and that

\[ \sum_{i=1}^{b} \|A_i(I_n)\|_{op}^2 \preceq_{st} 1 - U(1-U) \]

as well as $\|A_i\|_{op} \le 1$. Let $\gamma \approx 2.0047$ be the positive solution of
$12/7 = e^{2/\gamma} - 2/\gamma$ and $L_0 \approx 5.0177$ be the largest root of $e^L = 6L^2$. Then we have
for every $s$ and $n \ge 1$

\[ E[\exp(\langle s, X_n \rangle)] \le \begin{cases}
\exp\bigl(\tfrac{5}{2} \gamma^2 D^2 \|s\|^2\bigr), & \text{for } 0 \le \|s\| \le 1/(\gamma D), \\
\exp\bigl(24 D^2 \|s\|^2\bigr), & \text{for } 1/(\gamma D) < \|s\| \le L_0/D, \\
\exp\bigl(4 e^{D \|s\|}\bigr), & \text{for } L_0/D < \|s\|.
\end{cases} \]
Proof. The bounds for $\|s\| \le L_0/D$ follow immediately from Lemma 2.1
and Lemma 2.2. Since the function $x \mapsto e^{Dx}/x^2$ is monotonically increasing
on the interval $[L_0/D, \infty)$, Lemma 2.2 yields also the bound in the case
$\|s\| > L_0/D$. □

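Now, we get the tail bound for any entry of the vector $X_n$.

Proof of Theorem 1.1. We denote by $e_j$ the vector with 1 in the $j$-th entry
and 0 elsewhere. We use Chernoff's bounding technique and obtain for $u > 0$
and $j \in \{1, \dots, k\}$ with Corollary 2.3

\[ P(X_{n,j} \ge t) = P(\exp(u X_{n,j}) \ge \exp(ut)) \le E[\exp(u X_{n,j} - ut)] = E[\exp(u \langle e_j, X_n \rangle - ut)] \le \exp(K_u u^2 - ut), \]

where

\[ K_u := \begin{cases}
\tfrac{5}{2} \gamma^2 D^2, & \text{for } 0 \le u \le 1/(\gamma D), \\
24 D^2, & \text{for } 1/(\gamma D) < u \le L_0/D, \\
4 e^{Du} / u^2, & \text{for } L_0/D < u.
\end{cases} \]

For the left tail we obtain analogously

\[ P(X_{n,j} \le -t) = P(\exp(-u X_{n,j}) \ge \exp(ut)) \le E[\exp(-u X_{n,j} - ut)] = E[\exp(-u \langle e_j, X_n \rangle - ut)] \le \exp(K_u u^2 - ut). \]

In order to minimize this bound we are looking for the minimum of the
function $f(u) := K_u u^2 - ut$. For each of the three forms of $K_u$ this function
takes its minimum at $u_i(t)$ and has the value $f(u_i(t))$, where

\[ u_1(t) = \frac{t}{5 \gamma^2 D^2}, \quad f(u_1(t)) = -\frac{t^2}{10 \gamma^2 D^2} \quad \text{for } K_u = \tfrac{5}{2} \gamma^2 D^2, \]
\[ u_2(t) = \frac{t}{48 D^2}, \quad f(u_2(t)) = -\frac{t^2}{96 D^2} \quad \text{for } K_u = 24 D^2, \]
\[ u_3(t) = \frac{1}{D} \log \frac{t}{4D}, \quad f(u_3(t)) = \frac{t}{D} - \frac{t}{D} \log \frac{t}{4D} \quad \text{for } K_u = \frac{4 e^{Du}}{u^2}, \]

provided that $u_i(t) \in U_i$ with $U_1 := [0, 1/(\gamma D)]$, $U_2 := (1/(\gamma D), L_0/D]$ and
$U_3 := (L_0/D, \infty)$.

If $u_i(t) \notin U_i$ for a given $t$, we can take $u$ at the proper boundary of $U_i$.
Comparing the different values of the minimum for $i = 1, 2, 3$ we obtain the
total minimum. For $t \in [0, 5 \gamma D]$ we have the following possibilities:

\[ u_1(t) = \frac{t}{5 D^2 \gamma^2}, \quad f(u_1(t)) = -\frac{t^2}{10 \gamma^2 D^2}, \]
\[ u_2 = \frac{1}{\gamma D}, \quad f(u_2) = \frac{24}{\gamma^2} - \frac{t}{\gamma D}. \]

The minimum is attained at $u_1(t)$.

Similarly, we obtain the minimum in the other cases by making the following
choices:

for $t \in [5 \gamma D, 48D/\gamma + D \sqrt{48 (48/\gamma^2 - 5)})$

\[ u_1 = \frac{1}{\gamma D}, \quad K_u = \tfrac{5}{2} \gamma^2 D^2, \quad f(u_1) = \frac{5}{2} - \frac{t}{\gamma D}, \]

for $t \in [48D/\gamma + D \sqrt{48 (48/\gamma^2 - 5)}, 48 D L_0)$

\[ u_2(t) = \frac{t}{48 D^2}, \quad K_u = 24 D^2, \quad f(u_2(t)) = -\frac{t^2}{96 D^2}, \]

for $t \in [48 D L_0, 4 D e^{L_0})$

\[ u_2 = \frac{L_0}{D}, \quad K_u = 24 D^2, \quad f(u_2) = 24 L_0^2 - \frac{L_0}{D} t, \]

and for $t \in [4 D e^{L_0}, \infty)$

\[ u_3(t) = \frac{1}{D} \log \frac{t}{4D}, \quad K_u = \frac{4 e^{D u_3(t)}}{u_3(t)^2}, \quad f(u_3(t)) = \frac{t}{D} - \frac{t}{D} \log \frac{t}{4D}. \]

□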
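The optimization step of the Chernoff argument can be cross-checked by brute force. The following sketch minimizes $u \mapsto K_u u^2 - ut$ on a grid, assuming the piecewise constant $K_u$ equals $\frac{5}{2}\gamma^2D^2$, $24D^2$ and $4e^{Du}/u^2$ on the three ranges of $u$, and using the approximate values of $\gamma$ and $L_0$ quoted in the text. The names `K`, `chernoff_exponent` and `grid_minimum` are illustrative.

```python
import math

GAMMA = 2.0047   # approximate value quoted in the text
L0 = 5.0177      # approximate value quoted in the text

def K(u, D):
    """Piecewise constant K_u from the Laplace transform bounds (assumed form)."""
    if u <= 1 / (GAMMA * D):
        return 2.5 * GAMMA**2 * D**2
    if u <= L0 / D:
        return 24 * D**2
    return 4 * math.exp(D * u) / u**2

def chernoff_exponent(u, t, D):
    """Exponent K_u u^2 - u t of the Chernoff bound for a given u > 0."""
    return K(u, D) * u * u - u * t

def grid_minimum(t, D, u_max, steps=200000):
    """Smallest exponent over the grid u = u_max * k / steps, k = 1..steps."""
    return min(chernoff_exponent(u_max * k / steps, t, D)
               for k in range(1, steps + 1))
```

For $D = 1$, the grid minimum at $t = 5$, $t = 100$ and $t = 1000$ agrees with the closed forms $-t^2/(10\gamma^2)$, $-t^2/96$ and $t(1 - \log(t/4))$ respectively, one value for each of the three interior optima.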



3 Applications to random trees 

An example of a vector which satisfies the recursion formula (1) is the vector
consisting of the internal path length and the Wiener index of a random tree
in which each subtree is (conditioned upon its size) an independent copy
of the whole tree.

The internal path length of a rooted tree is the sum of all node depths of 
the tree. The depth of a node is given by the number of edges on the path 
from the node to the root. Analogously, the Wiener index is the sum of the 
distances between all unordered pairs of nodes where the distance is given 
by the number of edges on the unique path between the two nodes. 
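Both functionals are easy to compute for a concrete tree. The sketch below assumes unit edge lengths and a tree given as a parent array in which `parent[0]` is `None` for the root and `parent[v] < v` for every other node (an encoding chosen here for illustration, not taken from the paper). The Wiener index is obtained from the standard edge-counting identity: the edge above node $v$ lies on exactly $s_v (n - s_v)$ paths, where $s_v$ is the size of the subtree rooted at $v$.

```python
def path_length_and_wiener(parent):
    """Internal path length and Wiener index of a rooted tree.

    parent[0] is None (root); parent[v] < v for v >= 1; all edges have
    length 1. Returns the pair (P, W).
    """
    n = len(parent)
    depth = [0] * n
    size = [1] * n
    # Parents precede children, so depths can be filled in label order.
    for v in range(1, n):
        depth[v] = depth[parent[v]] + 1
    # Children have larger labels, so subtree sizes accumulate in reverse.
    for v in range(n - 1, 0, -1):
        size[parent[v]] += size[v]
    P = sum(depth)
    W = sum(size[v] * (n - size[v]) for v in range(1, n))
    return P, W
```

For the path on three nodes this gives $(P, W) = (3, 4)$, and for the star on three nodes $(2, 4)$.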
3.1 The random b-ary recursive trees with weighted edges

In this section we consider the special case of a random $b$-ary recursive tree
with weighted edges.

The random $b$-ary recursive tree is a rooted, ordered, labelled tree where the
outdegree is bounded by $b$ and the labels along each path beginning at the
root increase. We define this tree model by the following recursive procedure.
We consider the infinite complete $b$-ary rooted, ordered tree and start with
the root as the first internal node and its $b$ children as external nodes. Given
the random $b$-ary recursive tree with $n$ internal nodes, the $(n+1)$-st internal
node is added in the following way. We choose a random node uniformly
distributed on the set of all current external nodes, change it to an internal
one and add the $b$ children of this new node to the set of external nodes.
Finally, the nodes are labelled in the order of their appearance.
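The growth procedure can be simulated directly. The following is a minimal sketch that records only the parent of each internal node and ignores the left-to-right order of the children; the function name `grow_bary_recursive_tree` and the parent-array encoding are choices made here for illustration.

```python
import random

def grow_bary_recursive_tree(n, b, rng):
    """Grow a random b-ary recursive tree with n internal nodes.

    Returns (parent, number_of_external_nodes). Node 0 is the root and
    parent[v] is the internal node that node v was attached to.
    """
    parent = [None]
    # One list entry per external node, storing the internal node above it.
    external = [0] * b
    for label in range(1, n):
        k = rng.randrange(len(external))       # uniform external node
        external[k], external[-1] = external[-1], external[k]
        p = external.pop()                      # it becomes internal ...
        parent.append(p)
        external.extend([label] * b)            # ... and gets b external children
    return parent, len(external)
```

After $n$ insertions there are $n(b-1) + 1$ external nodes, every internal node has at most $b$ internal children, and labels increase along each path from the root.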

Let $Z := (Z_1, \dots, Z_b) \in \mathbb{R}^b$ be a random vector with non-negative entries
and attach to every node $u$ of the complete infinite $b$-ary tree an independent
copy $Z^{(u)}$ of $Z$. We consider the entries of $Z^{(u)}$ as weights of the edges from
$u$ to its $b$ children. Let $T_n$ denote the random $b$-ary recursive tree with $n$
internal nodes. If all $Z^{(u)}$ are independent of $T_n$, we refer to $T_n$ supplied
with the family $\{Z^{(u)}\}$ as a random $b$-ary recursive tree with edge weights
$Z$.

While the entries of the vector $Z$ may depend on each other, we assume that
they are identically distributed, i.e. for all $i, j \in \{1, \dots, b\}$ we have $Z_i \stackrel{d}{=} Z_j$,
and denote its expectation by $\mu := E[Z_1]$. This assumption is not restrictive
for the intended limit theorems as can be seen by a permutation argument
(see Munsonius, 2010a, p. 14-15). For instance, the shape of the random
binary search tree is equally distributed as the shape of the random $b$-ary
recursive tree with edge weights $(Z_1, Z_2) = (1, 1)$ for $b = 2$.

Let $Y_n = (W_n, P_n)$ denote the vector consisting of the Wiener index and the
internal path length of the random $b$-ary recursive tree of size $n$ with edge
weights $Z$. In Munsonius (2010b) it is shown that the vector

\[ X_n := \begin{pmatrix} 1/n^2 & 0 \\ 0 & 1/n \end{pmatrix} (Y_n - E[Y_n]) \tag{12} \]

satisfies the recursion formula (1) where the matrices $A_i(I_n)$ are given by

\[ A_i(I_n) = \begin{pmatrix} \dfrac{I_{n,i}^2}{n^2} & \dfrac{I_{n,i} (n - I_{n,i})}{n^2} \\[1ex] 0 & \dfrac{I_{n,i}}{n} \end{pmatrix} \]

and the vector $d(I_n, Z) = (d_1^{(n)}, d_2^{(n)})^T$ is given by



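\[ d_1^{(n)} = \frac{b}{b-1} \mu \sum_{i=1}^{b} \frac{I_{n,i}}{n} \log \frac{I_{n,i}}{n}
+ \sum_{i \ne j} \Bigl(\frac{Z_i + Z_j}{2} + \frac{b}{b-1} \mu\Bigr) \frac{I_{n,i}}{n} \frac{I_{n,j}}{n} + o(1) \tag{13} \]

and

\[ d_2^{(n)} = \frac{b}{b-1} \mu \sum_{i=1}^{b} \frac{I_{n,i}}{n} \log \frac{I_{n,i}}{n} + \sum_{i=1}^{b} \frac{I_{n,i}}{n} Z_i + o(1). \tag{14} \]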



To apply the result of the previous section, we have to prove the stochastic
domination condition for the $b$-ary recursive tree.

3.1.1 Coupling

For $(x_1, \dots, x_b) \in \mathbb{R}^b$ we denote by $(x_{(1)}, \dots, x_{(b)})$ the order statistic, i.e.

\[ x_{(1)} \ge x_{(2)} \ge \dots \ge x_{(b)} \]

and the entries of $(x_1, \dots, x_b) \in \mathbb{R}^b$ and $(x_{(1)}, \dots, x_{(b)})$ are the same. We
consider the space $\mathbb{R}^b$ with the partial order given by

\[ (x_1, \dots, x_b) \le (y_1, \dots, y_b) \; :\Leftrightarrow \; x_i \le y_i \text{ for all } i \in \{1, \dots, b\} \]

and define $E_b := \{(x_1, \dots, x_b) \in \mathbb{R}^b \mid x_1 \ge x_2 \ge \dots \ge x_b\}$. Moreover,
we denote by $PU(b)$ a Pólya urn with balls of $b$ different colors $\{1, \dots, b\}$,
which contains at the beginning 1 ball of each color and after a ball of color
$j$ is drawn, it is returned to the urn together with another $b - 1$ balls of the
same color.

Considering the evolution process which yields the random $b$-ary recursive
tree, it is not difficult to see that the vector of the sizes of the subtrees
has the same distribution as the vector of the numbers of drawings of a
ball of the different colors in the urn described above (for more details see
Munsonius, 2010a, Section 2.2). Using this, the next two lemmas provide
the estimate we need.
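The urn $PU(b)$ is straightforward to simulate. The sketch below follows the description above (one ball of each color initially, $b - 1$ extra balls of the drawn color after each draw); the function name `polya_urn_draw_counts` is illustrative.

```python
import random

def polya_urn_draw_counts(n, b, rng):
    """Simulate n draws from the Polya urn PU(b).

    Returns (draws, balls): draws[j] counts how often color j was drawn,
    balls[j] is the final number of balls of color j.
    """
    balls = [1] * b
    draws = [0] * b
    for _ in range(n):
        # Color j is drawn with probability balls[j] / sum(balls).
        j = rng.choices(range(b), weights=balls)[0]
        draws[j] += 1
        balls[j] += b - 1   # drawn ball returned plus b - 1 new ones
    return draws, balls
```

After $n$ draws the urn holds $b + n(b-1)$ balls and color $j$ contributes $1 + (b-1)\,\mathrm{draws}[j]$ of them, which matches the number of external nodes in the subtrees of the $b$-ary recursive tree.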

Lemma 3.1. For $j \in \{1, \dots, b\}$ let $J_{n,j}$ denote the number of times that
the drawn ball is of the color $j$ during the first $n$ drawings of the Pólya urn
$PU(b)$ and $I_{n,j}$ the corresponding number for the Pólya urn $PU(b+1)$. Then
we have for the vectors $J_n := (J_{n,1}, \dots, J_{n,b})$ and $I_n := (I_{n,1}, \dots, I_{n,b+1})$

\[ (I_{n,(1)}, \dots, I_{n,(b)}) \preceq_{st} (J_{n,(1)}, \dots, J_{n,(b)}). \]
We prove this lemma by using a result about stochastic domination between
Markov chains (see Lindvall, 1992, Section IV.5, Theorem (5.8)). With
$e_j \in \mathbb{R}^b$ we denote the vector where all entries are 0 except the $j$-th entry
which is 1.

Proof. It suffices to show that there is a coupling of $I'_n := (I_{n,(1)}, \dots, I_{n,(b)})$
and $J'_n := (J_{n,(1)}, \dots, J_{n,(b)})$ such that $I'_n \le J'_n$ almost surely.
The sequences $(J'_n)$ and $(I'_n)$ are Markov chains. To write down the transition
probabilities we define $\alpha_j : E_b \to \mathbb{N}_0$ by

\[ \alpha_j(x_1, \dots, x_b) := \begin{cases} |\{i \mid x_j = x_i\}|, & \text{if } x_{j-1} > x_j, \\ 0, & \text{otherwise.} \end{cases} \]

Thus, the transition probability for $J'_n$ is given by the kernel $K_n : E_b \times E_b \to [0, 1]$ with

\[ K_n(x, x + e_j) := P(J'_{n+1} = x + e_j \mid J'_n = x) = \frac{1 + x_j (b-1)}{b + n(b-1)}\, \alpha_j(x) \]

for $x = (x_1, \dots, x_b) \in E_b$ and $j = 1, \dots, b$. For the transition probability of
$I'_n$ we get the kernel $K'_n : E_b \times E_b \to [0, 1]$ with

\[ K'_n(x, x + e_j) := P(I'_{n+1} = x + e_j \mid I'_n = x) = \frac{1 + x_j b}{b + 1 + nb}\, \alpha_j(x) \]

for $j = 1, \dots, b$ and

\[ K'_n(x, x) := P(I'_{n+1} = x \mid I'_n = x) = \frac{1 + \bigl(n - \sum_{i=1}^{b} x_i\bigr) b}{b + 1 + nb}. \]

Let $x, y \in E_b$ with $y \le x$. We claim that $K'_n(y, \cdot)$ is stochastically dominated
by $K_n(x, \cdot)$.

If

\[ P(I'_{n+1} = y + e_j \mid I'_n = y) > 0 \]

we have $\alpha_j(y) \ne 0$. For $y_j < x_j$ we get $y + e_j \le x$. Thus, we only have
to consider the case where $\alpha_j(y) \ne 0$ and $y_j = x_j$. Let $j_1, \dots, j_m$ be the
components for which $\alpha_{j_l}(y) \ne 0$ and $x_{j_l} = y_{j_l}$ for $1 \le l \le m$. Then we have
$\alpha_{j_l}(x) \ge \alpha_{j_l}(y)$ because $x_{j_l - 1} \ge y_{j_l - 1} > y_{j_l} = x_{j_l}$. Since $x_{j_l} \le n$ we get

\[ \frac{1 + x_{j_l} (b-1)}{b + n(b-1)} \ge \frac{1 + x_{j_l} b}{b + 1 + nb}. \]

This yields for all $l \in \{1, \dots, m\}$

\[ K_n(x, x + e_{j_l}) = \frac{1 + x_{j_l} (b-1)}{b + n(b-1)}\, \alpha_{j_l}(x) \ge \frac{1 + x_{j_l} b}{b + 1 + nb}\, \alpha_{j_l}(y) = K'_n(y, y + e_{j_l}). \]

For $i \in \{1, \dots, b\} \setminus \{j_1, \dots, j_m\}$ we obviously have

\[ K_n(x, x + e_i) \ge 0 = K'_n(y, y + e_i). \]

Hence, we can find a coupling $(\hat{J}_{n+1}, \hat{I}_{n+1})$ of $(J'_{n+1}, I'_{n+1})$ with

\[ P(\hat{I}_{n+1} \le \hat{J}_{n+1} \mid (\hat{J}_n, \hat{I}_n) = (x, y)) = 1 \]

for all $x, y \in E_b$ with $y \le x$. This implies that $K_n(x, \cdot)$ stochastically
dominates $K'_n(y, \cdot)$. Because of the Markov property we conclude with Lindvall
(1992, Section IV.5, Theorem (5.8)) that there exists a coupling $(\hat{J}, \hat{I})$ of $J'$
and $I'$ such that $\hat{I}_n \le \hat{J}_n$ almost surely for all $n \in \mathbb{N}$. □
Lemma 3.2. Let $f : [0, 1] \to \mathbb{R}$ be the function given by

\[ f(x) = x^4 + (x^2 - x^3) \bigl(1 + \sqrt{x^2 + 1}\bigr). \]

Then, for $(x_1, \dots, x_b) \in E_b$ and $(y_1, \dots, y_{b+1}) \in E_{b+1}$ with $\sum_{i=1}^{b} x_i = \sum_{j=1}^{b+1} y_j = 1$
and $(y_1, \dots, y_b) \le (x_1, \dots, x_b)$ we have

\[ \sum_{i=1}^{b+1} f(y_i) \le \sum_{i=1}^{b} f(x_i). \]



Proof. We first show that the function $f$ is convex. To do this, we compute
the second derivative which is given by

\[ f''(x) = 12 x^2 + (2 - 6x) \bigl(1 + \sqrt{x^2 + 1}\bigr) + \frac{2 (2x^2 - 3x^3)}{\sqrt{x^2 + 1}} + \frac{x^2 - x^3}{(x^2 + 1)^{3/2}} \]
\[ \ge 10 x^2 + (2 - 6x) \bigl(1 + \sqrt{x^2 + 1}\bigr) + \frac{2 (3x^2 - 3x^3)}{\sqrt{x^2 + 1}} + \frac{x^2 - x^3}{(x^2 + 1)^{3/2}}. \]

To show convexity it suffices to show $f'' \ge 0$. Since for $x \in [0, 1]$ we have $x^2 \ge x^3$,
it remains to show

\[ g(x) := 10 x^2 + (2 - 6x) \bigl(1 + \sqrt{x^2 + 1}\bigr) \ge 0. \]

By consideration of the first and second derivatives we see that the minimum
of $g$ is attained at $x = 3/4$ with $g(3/4) = 0$. Taking everything into account,
we obtain $f''(x) \ge 0$ for all $x \in [0, 1]$ which implies that the first derivative
$f'$ is monotone increasing.
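The convexity claim can be checked numerically. The sketch below assumes $f(x) = x^4 + (x^2 - x^3)(1 + \sqrt{x^2+1})$ and $g(x) = 10x^2 + (2 - 6x)(1 + \sqrt{x^2+1})$ as in the proof, and compares the closed form of $f''$ with a central second difference; the function names are illustrative.

```python
import math

def f(x):
    """Function f from Lemma 3.2."""
    return x**4 + (x**2 - x**3) * (1 + math.sqrt(x * x + 1))

def g(x):
    """Auxiliary lower bound g used in the convexity argument."""
    return 10 * x * x + (2 - 6 * x) * (1 + math.sqrt(x * x + 1))

def f_second(x):
    """Closed form of f'' as stated in the proof."""
    s = math.sqrt(x * x + 1)
    return (12 * x * x + (2 - 6 * x) * (1 + s)
            + 2 * (2 * x * x - 3 * x**3) / s + (x * x - x**3) / s**3)

def second_difference(x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

grid = [k / 1000 for k in range(1001)]
g_min = min(g(x) for x in grid)          # attained at x = 0.75
f2_min = min(f_second(x) for x in grid)  # stays positive on [0, 1]
```

The value $g(3/4)$ evaluates to exactly zero in floating point because $\sqrt{(3/4)^2 + 1} = 5/4$, and $f''$ remains nonnegative on the whole grid.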
By assumption, there exist numbers $\alpha_1, \dots, \alpha_b \ge 0$ with $x_i = y_i + \alpha_i y_{b+1}$ for
$i = 1, \dots, b$ and $\sum_{i=1}^{b} \alpha_i = 1$. The monotonicity of $f'$ and $f(0) = 0$ imply
with the mean value theorem

\[ \frac{f(x_i) - f(y_i)}{\alpha_i y_{b+1}} = f'(\xi) \ge f'(\eta) = \frac{f(y_{b+1})}{y_{b+1}} \]

for some $\xi \in [y_i, x_i]$ and $\eta \in [0, y_{b+1}] \subset [0, y_i]$. This finally yields

\[ \sum_{i=1}^{b} f(x_i) \ge \sum_{i=1}^{b} \bigl(f(y_i) + \alpha_i f(y_{b+1})\bigr) = \sum_{i=1}^{b+1} f(y_i). \]

□



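Proof of Theorem 1.2. As seen in equation (12) we have for the vector $X_n$
the recursion formula (1) where

\[ A_i(I_n) = \begin{pmatrix} \dfrac{I_{n,i}^2}{n^2} & \dfrac{I_{n,i} (n - I_{n,i})}{n^2} \\[1ex] 0 & \dfrac{I_{n,i}}{n} \end{pmatrix}. \]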



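For the operator norm we obtain

\[ \|A_i(I_n)\|_{op}^2 = \|A_i^T(I_n) A_i(I_n)\|_{op}. \]

The matrix $A_i^T(I_n) A_i(I_n)$ is symmetric. Thus, its operator norm is given by
the largest absolute eigenvalue. Solving the characteristic equation for the
matrix we obtain that its eigenvalue being larger in absolute value is given
by

\[ \frac{I_{n,i}^2}{n^2} \Bigl(1 - \frac{I_{n,i}}{n} + \frac{I_{n,i}^2}{n^2} + \Bigl(1 - \frac{I_{n,i}}{n}\Bigr) \sqrt{\frac{I_{n,i}^2}{n^2} + 1}\Bigr) = f\Bigl(\frac{I_{n,i}}{n}\Bigr) \]

with the function $f$ as given in Lemma 3.2. This yields with Lemma 3.2 and
Lemma 3.1

\[ \sum_{i=1}^{b} f\Bigl(\frac{I_{n,i}}{n}\Bigr) \preceq_{st} \sum_{i=1}^{2} f\Bigl(\frac{J_{n,i}}{n}\Bigr) \]

and hence

\[ \sum_{i=1}^{b} \|A_i(I_n)\|_{op}^2 \preceq_{st} \sum_{i=1}^{2} \|A_i(J_n)\|_{op}^2 \]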

where $J_n = (J_{n,1}, J_{n,2})$ is the vector of the sizes of the subtrees of a random
binary search tree, i.e. $J_{n,1}$ is uniformly distributed on $\{0, \dots, n-1\}$ and
$J_{n,2} = n - 1 - J_{n,1}$. By Lemma 2.2 from Ali Khan and Neininger (2007) we
get the stochastic domination condition

\[ \sum_{i=1}^{b} \|A_i(I_n)\|_{op}^2 \preceq_{st} 1 - U(1-U). \]

Considering the toll vector $d(I_n, Z)$ in (13) and (14), the boundedness of
$\|Z\|$ implies that its norm is bounded almost surely by some constant $D$.
Furthermore, we trivially have $X_1 = 0$ since the tree with one node is only
the root. The claim follows by Theorem 1.1. □
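The identification of the squared operator norm with $f$ can be verified numerically. The sketch below assumes the coefficient matrix has the form $A(x) = \begin{pmatrix} x^2 & x(1-x) \\ 0 & x \end{pmatrix}$ with $x = I_{n,i}/n$, which is the form consistent with the function $f$ of Lemma 3.2, and computes the largest eigenvalue of the symmetric $2 \times 2$ matrix $A^T A$ from its trace and determinant. The function names are illustrative.

```python
import math

def f(x):
    """Function f from Lemma 3.2."""
    return x**4 + (x**2 - x**3) * (1 + math.sqrt(x * x + 1))

def largest_eigenvalue_AtA(x):
    """Largest eigenvalue of A^T A for A = [[x^2, x(1-x)], [0, x]].

    Assumption: this is the coefficient matrix with x = I_{n,i} / n.
    """
    a11, a12, a22 = x * x, x * (1 - x), x
    # Entries of the symmetric matrix M = A^T A.
    m11 = a11 * a11
    m12 = a11 * a12
    m22 = a12 * a12 + a22 * a22
    tr = m11 + m22
    det = m11 * m22 - m12 * m12
    disc = max(tr * tr - 4 * det, 0.0)   # guard against rounding below zero
    return 0.5 * (tr + math.sqrt(disc))

max_gap = max(abs(largest_eigenvalue_AtA(k / 200) - f(k / 200))
              for k in range(201))
```

Under this assumption the two expressions agree on $[0, 1]$ up to rounding, and in particular $f(1) = 1$ reflects $\|A(1)\|_{op} = 1$.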



In Munsonius (2010b) the asymptotic expansion of the expectation of $P_n$
and $W_n$ is determined.

Using these results, we obtain asymptotic tail bounds.

Proof of Corollary 1.3. With $y_n = t E[P_n]/n = \frac{b\mu}{b-1}\, t \log n + O(1)$ we obtain by
Theorem 1.2, because of $\lim_{n \to \infty} y_n = \infty$,

\[ P(|P_n - E[P_n]| > t E[P_n]) = P\Bigl(\frac{|P_n - E[P_n]|}{n} > y_n\Bigr) \]
\[ \le \exp\Bigl(\Bigl(\frac{b\mu}{b-1} \frac{t}{D} \log n + O(1)\Bigr) \Bigl(1 - \log \frac{t b \mu \log n}{4D(b-1)} + o(1)\Bigr)\Bigr) \]
\[ = \exp\Bigl(-\frac{b\mu}{(b-1)D}\, t \log n \,\bigl(\log^{(2)} n + \log t + a + o(1)\bigr)\Bigr). \]

With $z_n = t E[W_n]/n^2 = y_n + O(1)$ the claim for $W_n$ follows as well. □



3.2 Random linear recursive trees 

In this section we transfer the results for the random $b$-ary recursive tree
with weighted edges to linear recursive trees. In this tree, every node $u$ has
a weight $w_u$. Starting with the root, the tree grows node by node. In each
step the new node is attached to a randomly chosen node of the previous
ones. The probability that node $u$ is chosen is proportional to the weight
$w_u$ of the node. In the case of linear recursive trees the weight is given by
$w_u = 1 + \beta \deg(u)$ where $\deg(u)$ is the number of children of $u$ and $\beta \in \mathbb{R}_{\ge 0}$
is the parameter of the tree.
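The growth rule can be sketched in a few lines. The code below attaches each new node to an existing node $u$ with probability proportional to $1 + \beta \deg(u)$; the function name `random_linear_recursive_tree` and the parent-array encoding are illustrative choices.

```python
import random

def random_linear_recursive_tree(n, beta, rng):
    """Grow a random linear recursive tree with n nodes and parameter beta.

    Returns a parent array: parent[0] is None (root) and parent[v] is the
    node that v was attached to.
    """
    parent = [None]
    weight = [1.0]                      # w_u = 1 + beta * deg(u)
    for v in range(1, n):
        u = rng.choices(range(v), weights=weight)[0]
        parent.append(u)
        weight[u] += beta               # u gained one child
        weight.append(1.0)              # the new node has no children yet
    return parent
```

With `beta = 0` every existing node is equally likely, which gives the random recursive tree, and `beta = 1` corresponds to the plane oriented recursive tree without the order of the children.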

Given a random linear recursive tree $T_n$ of size $n$ with weight function $u \mapsto
1 + (b-2)\deg(u)$ we consider a $b$-ary recursive tree $\tilde{T}_{n-1}$ of size $n - 1$ where
the edges are weighted by the random vector $Z$ which is obtained by a
uniformly distributed permutation of the entries of $(1, 0, \dots, 0) \in \mathbb{R}^b$. In
particular, we have $\mu = 1/b$. Denote by $\tilde{P}_{n-1}$ and $\tilde{W}_{n-1}$ (resp. $P_n$ and $W_n$)
the internal path length and the Wiener index of $\tilde{T}_{n-1}$ (resp. $T_n$).



Proof of Corollary 1.4. With the notation above, it is shown in Munsonius
(2010b) that

\[ P_n = \tilde{P}_{n-1} + n - 1 \]

holds. Therefore, we have

\[ \frac{P_n - E[P_n]}{n} \stackrel{d}{=} \frac{\tilde{P}_{n-1} - E[\tilde{P}_{n-1}]}{n}. \]

The claim follows immediately by Theorem 1.2 and Corollary 1.3.

□



Proof of Corollary 1.5. With the notation above, it is shown in Munsonius
(2010b) that

\[ W_n = \tilde{W}_{n-1} - \tilde{P}_{n-1} + (n-1)^2 \]

holds. This yields

\[ P(|W_n - E[W_n]| > t E[W_n]) = P\bigl(|\tilde{W}_{n-1} - E[\tilde{W}_{n-1}] - \tilde{P}_{n-1} + E[\tilde{P}_{n-1}]| > t E[W_n]\bigr). \]

Moreover, in Munsonius (2010b) it is shown that $\mathrm{Var}(\tilde{P}_{n-1}) = O(n^2)$ for
$n \to \infty$. Applying Chebyshev's inequality, we obtain for any $\varepsilon > 0$

Jift- 1 ,»-^-.ii >£ L ay 



Since £[lf n _i] = E\W n ]-\-0{n\ogn) = 0(n 2 logn) this yields with Corollary 

E31 



P(\W n -E[W n ]\ >tE[W n }) 

( \Wn-i ~ E[W n ^}\ tE[W n 



+ o ^ 



n- 



< exp (— - 1 - jjtlogn (log( 2 ) n + logt + a + • 



□ 



Remark 3.3. Random plane oriented recursive trees without the order of the
nodes are equal in distribution to the random linear recursive tree with parameter
$\beta = 1$. Since the internal path length as well as the Wiener index are
invariant under changing the order of the tree, the tail bounds in Corollary
1.4 and Corollary 1.5 with $b = 3$ provide in particular the corresponding
tail bounds for the plane oriented recursive tree.
4 Lower tail bounds for the Wiener index

For the number of comparisons made by quicksort a lower bound for the
tail is proved in McDiarmid and Hayward (1996). There, a set of binary
trees is constructed that has high probability and implies a large number
of comparisons. They succeeded in finding lower and upper bounds
which have the same asymptotic behavior. This idea is employed by
Ali Khan and Neininger (2007) to prove a lower tail bound for the Wiener
index of binary search trees.

In Munsonius (2010a, Section 7.2) the construction from Ali Khan and Neininger
(2007) is extended to random $b$-ary recursive trees with weighted edges where
at least one entry of $Z$ is 1. This yields the following lower bound on the
tail of the distribution of the Wiener index.

Theorem 4.1. Let $W_n$ denote the Wiener index of a random $b$-ary recursive
tree of size $n$ with edge weights $Z$ where $\{Z_1, \dots, Z_b\} \cap \{1\} \ne \emptyset$ and $Z_i \ge 0$.
Then we have for fixed $t > 0$ and $n \to \infty$

$P(|W_n - E[W_n]| > t E[W_n])$

> exp [ lo § n (W 2) n + 0(log (3) n) j \ . 

With the transfer results already used in Section 3.2 we obtain a lower bound
for the distribution of the Wiener index of random linear recursive trees.

Theorem 4.2. Let $W_n$ denote the Wiener index of a random linear recursive
tree of size $n$ with weight function $u \mapsto 1 + (b-2)\deg(u)$ for $b \in \mathbb{N}$ and
$b \ge 2$. Then we have for fixed $t > 0$ and $n \to \infty$

$P(|W_{n+1} - E[W_{n+1}]| > t E[W_{n+1}])$

> exp ^-4^^t log n (log (2) n + 0(log (3) n)^j \ . 

Remark 3.3 holds true also for the lower tail bound.



Remark 4.3. The constants $D$ arising in the results depend on the specific
toll function which in turn depends on the functional and the tree model
considered. Since the toll function in (13) and (14) is only known up to an
$o(1)$-term, it is in general not possible to determine this constant. Nevertheless,
it is an analytical problem and should be solvable for special functionals
and tree models. For instance, in the case of the vector $(W_n, P_n)$ of the binary
search tree, Ali Khan and Neininger (2007) showed $D \le 1$.